Novel HCV core+1 protein, methods for diagnosis of HCV infections, prophylaxis, and for screening of anti-HCV agents

ABSTRACT

The present invention relates to a novel form of core+1 protein of Hepatitis C virus (HCV), designated shorter form core+1 protein. The shorter form core+1 protein of Hepatitis C virus is the product of translation of a coding sequence consisting of all or part of a nucleotide sequence extending from nucleotide 598 to nucleotide 920 within the core+1 ORF of HCV represented on FIG. 3B. The invention also provides methods for detecting infection by Hepatitis C virus in biological samples, methods of screening compounds which interact with viral propagation in HCV infected cells or screening of compounds impacting the expression of shorter form core+1 protein, and uses of these compounds for the preparation of compositions useful for their anti-viral activities.

The present invention relates to a novel form of core+1 protein of Hepatitis C virus (HCV), designated shorter form core+1 protein. The invention also provides methods for detecting infection by Hepatitis C virus in biological samples, methods of screening compounds which interact with viral propagation in HCV infected cells and advantageously decrease, inhibit or prevent viral propagation, or screening of compounds impacting the expression of shorter form core+1 protein, and uses of these compounds for the preparation of compositions useful for their anti-viral activities. The invention also proposes to use the shorter form core+1 protein of the invention to derive immunogenic compositions for protection against HCV infection or against its consequences.

Hepatitis C is a viral infection of the liver which was referred to as “non A, non B hepatitis” (NANBH) until identification of the causative agent. Hepatitis C virus is one of the viruses (A, B, C, D and E) which together account for the majority of cases of viral hepatitis. Hepatitis C virus was first identified in 1989 (Choo et al. 1989) and defined as a common cause of liver disease, with an estimated 170 million infected people worldwide. Hepatitis C virus (HCV) infection affects the liver and causes hepatitis, i.e., an inflammation of the liver. 75 to 85% of persons infected with HCV progress to chronic infection, and approximately 20% of these cases develop complications of chronic hepatitis C, including cirrhosis of the liver or hepatocellular carcinoma, after 20 years of infection (Di Bisceglie 2000). The currently recommended treatment for HCV infections is a combination of interferon and ribavirin; however, the treatment is not effective in all cases and liver transplantation is indicated in hepatitis C-related end-stage liver disease. At present, there is no vaccine available to prevent HCV infection; therefore all precautions to avoid infection must be taken.

HCV is a (+) sense single-stranded enveloped RNA virus in the Hepacivirus genus within the Flaviviridae family. The viral genome is approximately 10 kb in length and encodes a 3011 amino acid polyprotein precursor. The HCV genome has a large single open reading frame (ORF) coding for a unique polyprotein, said polyprotein being co- and post-translationally processed by cellular and viral proteases into three structural proteins, i.e., core, E1 and E2, and at least six non-structural proteins, NS2, NS3, NS4A, NS4B, NS5A and NS5B (Houghton 1996 and Reed et al. 2000).

Initiation of translation of the HCV genome is controlled by an internal ribosome entry site (IRES) located mainly within the 5′-non coding region of the viral RNA, between nucleotides 42 and 341 or 356, the 3′ limit being controversial. The core protein, which forms the viral nucleocapsid, is predicted to be 191 amino acids in length and to have a molecular mass of 23 kDa (p23). Further processing of p23 produces the mature core protein (p21), consisting of between 173 and 182 amino acids. It has been previously reported that a protein having a molecular weight of about 17 kDa is also expressed from the core protein-coding sequence of some HCV isolates both in vitro and in vivo, e.g. in E. coli cells. This additional HCV polypeptide of 16/17 kDa (p16/p17), consisting of a maximum of 160 amino acids, is encoded by the open reading frame that overlaps the core gene in the +1 frame (core+1 ORF) and is synthesized in vitro as a result of a +1 ribosomal frameshift during translation.

This 16/17 kDa polypeptide is named ARFP for Alternative Reading Frame Protein, or F for Frameshift protein, or core+1 according to the location of this novel protein. The ARFP/F/core+1 protein is synthesized in vitro from the initiator codon of the polyprotein sequence followed by a +1 ribosomal frameshift operating in the region of core codons 8-14 (Xu et al. 2001, Varaklioti et al. 2002).

More recently, the expression of the core+1 protein coding sequence has been assayed in mammalian cells, i.e. in vivo, in order to investigate the biological importance of the core+1 protein. It has been shown that, in rabbit reticulocyte lysates (in vitro), expression of the core+1 ORF can be obtained for the HCV-1 isolate whereas it is not detected for the HCV-1a(H) isolate (Varaklioti et al. 2002). Indeed, the core+1 protein has been synthesized in vitro when expressing the core+1 ORF from HCV-1 but has not been detected when expressing the core+1 ORF from HCV-1a(H). It is recalled that the HCV-1 and HCV-1a(H) isolates of HCV, although belonging to the same genotype, have different sequences at the frameshift site located in codons 8-14 of HCV-1. The difference especially consists in the lack of the run of 10 A nucleotide residues in the HCV-1a(H) sequence at the putative frameshift site. In order to provide some data on the expression mechanisms of the core+1 protein, the inventors have studied said expression in vivo.

The results disclosed in the present invention indicate that, unlike in the in vitro expression studies, both HCV-1 and HCV-1a(H) core coding sequences efficiently allow expression of the core+1 ORF in transfected mammalian cells. The transfection and expression experiments carried out in mammalian cells have also enabled the present inventors to identify that in vivo expression of the core+1 ORF is associated with synthesis of a new protein whose expression follows a new alternative translation initiation mechanism of the core+1 ORF when compared to the mechanism identified for the in vitro expression of core+1 protein. Said alternative mechanism directs the synthesis of a shorter form of core+1 protein in vivo.

Particular species of HCV-1 and HCV-1a(H) have been disclosed, respectively, in GenBank under accession Nos. M62321 and M67463.

Viruses, which are subject to genome size constraints, have developed different strategies to expand their coding capacity, such as ribosomal frameshifting or internal translational initiation. Ribosomal frameshifting consists in avoiding a termination codon, which would otherwise have been encountered by the ribosome, and instead creates a protein with extra amino acid sequences at its C terminal end. Therefore, in ribosomal frameshifting a directed change of translational reading frame allows the synthesis of a single protein from two or more overlapping genes. Internal translational initiation consists in escaping from an upstream initiator codon according to different mechanisms including leaky scanning, ribosome shunting and internal ribosome entry sites. Such a mechanism is apparently used for in vivo expression of shorter form core+1 protein.
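
The effect of a change of reading frame on the encoded protein can be illustrated with a minimal sketch. It is not part of the disclosed methods: the function name `translate` and the toy sequence are hypothetical, the sequence is not taken from FIG. 3B, and only the standard genetic code is assumed.

```python
# Standard genetic code, laid out in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq: str, offset: int = 0) -> str:
    """Translate seq codon by codon, starting `offset` bases in.

    offset=0 reads the reference frame, offset=1 the +1 frame.
    Translation stops at the first stop codon or at the end of the sequence.
    """
    seq = seq.upper().replace("U", "T")
    protein = []
    for i in range(offset, len(seq) - 2, 3):
        amino_acid = CODON_TABLE[seq[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical toy sequence, NOT taken from FIG. 3B.
TOY = "ATGGCATGCCTTAA"
frame_0 = translate(TOY, offset=0)       # reference frame
frame_plus_1 = translate(TOY, offset=1)  # +1 frame of the same nucleotides
```

The same nucleotides yield two unrelated peptides depending on the frame, which is the sense in which overlapping frames expand coding capacity.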

The invention thus provides a new protein of the HCV life cycle, which is designated shorter form core+1 protein and which can be obtained by in vivo expression of the core+1 coding sequence or ORF, especially in mammalian cells.

The invention also relates to nucleic acid sequences encoding said shorter form core+1 protein.

The invention also provides methods for detecting, in a biological sample of an individual, the presence or absence of the shorter form core+1 protein giving evidence of Hepatitis C virus infection.

The invention also provides use of the shorter form core+1 protein of the invention in an immunogenic composition. An immunogenic composition of the invention may advantageously be prepared in order to elicit a CTL response against HCV infection in a patient.

The shorter form core+1 HCV protein may also be involved in the preparation of therapeutic compositions aimed at interacting with the consequences of HCV infection, especially when persistent infection appears.

The invention also provides means for screening compounds, especially compounds having antiviral activity, as a result of interaction with in vivo expression of the core+1 ORF directing translation of shorter form core+1 protein. Among the several advantages of the present methods, it should be noted that these screening methods are appropriate for routine high throughput screening of compounds capable of interacting with viral propagation and control of the life cycle of the virus, especially compounds capable of inhibiting or preventing viral propagation.

Moreover, the invention also provides for the use of the compounds capable of interacting with viral propagation and control of the life cycle of the virus, especially compounds capable of inhibiting or preventing viral propagation, advantageously as a result of their capacity to interact with expression of shorter form core+1 protein in HCV infected cells, which compounds would be useful for the preparation of a drug for the treatment of disorders induced by or associated with infection by Hepatitis C virus.

A first object of the invention is thus a shorter form core+1 protein of HCV which is the product of translation of a coding sequence consisting of all or part of the nucleotide sequence extending from nucleotide 598 to nucleotide 920 within the core+1 ORF of HCV represented on FIG. 3B.

In a particular embodiment, the shorter form core+1 protein is encoded by a nucleotide sequence having a translation initiation codon (ATG) at position 598 or by a nucleotide sequence having an ATG at position 606 of the HCV core+1 coding sequence.

In a particular embodiment, the shorter form core+1 protein is encoded by:

-   -   (i) a nucleotide sequence extending from nucleotide 598 to nucleotide 826 of the sequence represented on FIG. 3B; or
    -   (ii) a nucleotide sequence extending from nucleotide 598 to nucleotide 897 of the sequence represented on FIG. 3B; or
    -   (iii) a nucleotide sequence extending from nucleotide 606 to nucleotide 826 of the sequence represented on FIG. 3B; or
    -   (iv) a nucleotide sequence extending from nucleotide 606 to nucleotide 897 of the sequence represented on FIG. 3B; or
    -   (v) a nucleotide sequence extending from nucleotide 606 to nucleotide 920 of the sequence represented on FIG. 3B.
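
The ranges listed above are inclusive nucleotide positions. As a minimal sketch, and assuming only that the first character of the working sequence string carries a known position number (the `first_pos` parameter, a hypothetical name introduced here), such ranges can be mapped to Python slices as follows. The placeholder sequence of `N` characters stands in for the sequence of FIG. 3B, which is not reproduced here.

```python
def extract(seq: str, start: int, end: int, first_pos: int = 1) -> str:
    """Return nucleotides start..end (inclusive, same numbering as the figure).

    first_pos is the coordinate assigned to seq[0]; it is an assumption of
    this sketch and must match the numbering of the sequence actually used.
    """
    if not first_pos <= start <= end <= first_pos + len(seq) - 1:
        raise ValueError("coordinates fall outside the sequence")
    return seq[start - first_pos:end - first_pos + 1]


# Ranges (i) to (v) as listed above.
RANGES = {
    "i": (598, 826),
    "ii": (598, 897),
    "iii": (606, 826),
    "iv": (606, 897),
    "v": (606, 920),
}

# Hypothetical placeholder sequence, numbered from 342 to 920.
placeholder = "N" * (920 - 342 + 1)
fragments = {
    name: extract(placeholder, start, end, first_pos=342)
    for name, (start, end) in RANGES.items()
}
lengths = {name: len(fragment) for name, fragment in fragments.items()}
```

Passing the real sequence in place of `placeholder` returns the corresponding fragments; the only design choice here is keeping the figure's numbering at the call site and converting to zero-based slices in one place.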

As used herein, the expression “shorter form core+1 protein” or “in vivo core+1 protein” refers to the Hepatitis C virus proteins obtainable in vivo, in cells infected with HCV, or in cells transfected with a DNA construct comprising the core coding sequence or the core+1 ORF. A predominant shorter form of core+1 is especially produced in vivo which is smaller than the 16/17 kDa core+1 in vitro synthesized product, as it is predicted to have a calculated molecular weight of less than 10 kDa. Furthermore, the shorter form core+1 protein does not contain the first 10 consecutive A residues of the core protein. These A residues are located in codons 8-11 (nucleotides 364-373) of the HCV-1 genome and are of great importance for the expression of the core+1 ORF. This specific difference in molecular weight explains the term “shorter form core+1 protein”.

As used herein, the expression “core+1 ORF” refers to the nucleotide sequence such as represented in FIG. 3B of the present application, which is comprised within the “core coding sequence” of HCV. Said core+1 ORF begins at nucleotide 342 with the translation initiation codon and extends up to the nucleotide at position 920 (U.S. Ser. No. 09/644,987) in the sequence illustrated on FIG. 3B.

It is pointed out that the shorter form core+1 protein is encoded by the core+1 ORF or by the core coding sequence, when said nucleotide sequences are expressed in vivo.

The invention relates further to a shorter form core+1 protein of HCV which is obtainable in vivo by expression of the core+1 open reading frame (ORF) which is contained in the nucleotide sequence extending from the nucleotide at position 342 to the nucleotide at position 920, preferably to the nucleotide at position 826, of the nucleotide sequence represented on FIG. 3B, and whose calculated molecular weight is less than 10 kDa.

It is emphasized that the shorter form core+1 protein is obtainable in vivo independently of the expression of the HCV polyprotein and also independently of the expression of core+1 protein. Said expression in vivo uses the same frame as the one used for core+1 expression in the core+1 ORF but does not involve the frameshift mechanism required for core+1 in vitro expression.

In another embodiment, the shorter form core+1 protein is the expression product of the core+1 ORF in mammalian cells.

In a preferred embodiment, the shorter form core+1 protein is recognized by sera of patients infected with HCV. In the same way, circulating anti-core+1 antibodies have been detected in HCV-infected individuals, suggesting that this protein is produced during natural HCV infection.

In a preferred embodiment, the shorter form core+1 protein comprises the amino acid sequence extending from the amino acid residue corresponding to nucleotide 598 to the amino acid residue corresponding to nucleotide 826, or to nucleotide 897 or to nucleotide 920 of the sequence represented on FIG. 3B. In another preferred embodiment, the shorter form core+1 protein comprises the amino acid sequence extending from the amino acid residue corresponding to nucleotide 606 to the amino acid residue corresponding to nucleotide 826, or to nucleotide 897 or to nucleotide 920 of the sequence represented on FIG. 3B.

The start and/or stop codons disclosed for shorter form core+1 protein may vary depending on the HCV isolate considered. The above positions of start and stop codons are given with respect to the amino-acid sequence of FIG. 3. Although the shorter form core+1 protein ending with the codon corresponding to nucleotide 826 can be regarded as a preferred form of said protein, the above given longer sequences may be encoded simultaneously or alternatively.

The invention further concerns peptides contained within the shorter form core+1 protein, especially peptides useful as epitopes. For the purposes of the present invention, the term “epitope” when referring to a peptide is to be considered as an antigenic determinant or the immunologically active region of said peptide. It is the portion of said immunogenic peptide which is bound specifically by an antibody or TCR. Said epitope on a peptide antigen may involve elements of the primary, secondary, tertiary, and even quaternary structure of the peptide and contains at least three residues. The present invention provides a particular peptide of interest, useful as an epitope, and having the following sequence:

-   -   COOH-T-Y-R-S-S-A-P-L-L-E-A-L-P-G-P-NH₂

Such peptide of interest comprises the amino-acid sequence extending from the amino-acid residue corresponding to nucleotide 749 to the amino-acid residue corresponding to nucleotide 793, or to nucleotide 796, in the sequence of FIG. 3B.

Variants of this peptide, such as those obtained by deletions, additions or substitutions of amino acids in the peptide, are also encompassed by the present invention and can be obtained by methods known in the art, as long as these variants can elicit antibodies or can immunologically react with antibodies directed against the above sequence.

Examples of variants of this peptide of interest encompassed by the present invention can be illustrated as follows and according to FIG. 8:

-   -   COOH-T-X-R-S-S-A-P-L-L-E-A-L-P-G-P-NH₂ where X=F or S;
    -   COOH-T-Y-X-S-S-A-P-L-L-E-A-L-P-G-P-NH₂ where X=L, P or R;
    -   COOH-T-Y-R-S-X-A-P-L-L-E-A-L-P-G-P-NH₂ where X=L;
    -   COOH-T-Y-R-S-S-X-P-L-L-E-A-L-P-G-P-NH₂ where X=V;
    -   COOH-T-Y-R-S-S-A-P-X-L-E-A-L-P-G-P-NH₂ where X=P or R;
    -   COOH-T-Y-R-S-S-A-P-L-X-E-A-L-P-G-P-NH₂ where X=S or W;
    -   COOH-T-Y-R-S-S-A-P-L-L-X-A-L-P-G-P-NH₂ where X=G, V, A, or E;
    -   COOH-T-Y-R-S-S-A-P-L-L-E-A-X-P-G-P-NH₂ where X=S;
    -   COOH-T-Y-R-S-S-A-P-L-L-E-A-L-X-G-P-NH₂ where X=Q;
    -   COOH-T-Y-R-S-S-A-P-L-L-E-A-L-P-X-P-NH₂ where X=E;
    -   COOH-T-Y-R-S-S-A-P-L-L-E-A-L-P-G-X-NH₂ where X=L or H;
    -   COOH-T-Y-R-S-S-A-P-L-L-E-A-L-P-G-P-X-NH₂ where X=C, W or S.

Such peptides are of interest especially for the preparation of antibodies, either polyclonal or monoclonal.

The translation initiation codon of shorter form core+1 protein may vary depending on the HCV isolate. Some isolates contain two ATG codons, both of which may be used for synthesis of shorter form core+1 protein. Other isolates contain only one ATG for said protein.

Various shorter form core+1 proteins are for example derivable from the protein alignment of the sequence of FIG. 3B with the amino-acid sequences disclosed in FIG. 8, which correspond to the proteins expressed by variants.

The invention also concerns a mosaic of proteins encoded by the above defined core coding sequence of HCV. Such a mosaic contains at least two proteins selected among core protein, core+1 protein, shorter form core+1 protein or their derivatives, including derivatives encoded by said sequence and involving a further frameshift mechanism in the 3′ terminal part of the core coding sequence.

These compositions of proteins can comprise proteins of the same isolate or from different HCV isolates.

The invention also relates to a nucleotide sequence consisting of a fragment of the nucleotide sequence extending from nucleotide 342 to nucleotide 920 represented on FIG. 3B, which fragment is capable of encoding a shorter form core+1 protein of HCV when transfected in mammalian cells under expression conditions.

More specifically, it is shown that the nucleotide sequence encoding a shorter form core+1 protein comprises a nucleotide sequence extending from nucleotide 598 or from nucleotide 606 to nucleotide 826 within the core+1 coding sequence of FIG. 3B.

In a specific embodiment, the nucleotide sequence encoding a shorter form core+1 protein is chosen among:

(i) a nucleotide sequence extending from nucleotide 606 to nucleotide 826 of the sequence represented on FIG. 3B;
(ii) a nucleotide sequence extending from nucleotide 606 to nucleotide 897 of the sequence represented on FIG. 3B;
(iii) a nucleotide sequence extending from nucleotide 606 to nucleotide 920 of the sequence represented on FIG. 3B;
(iv) a nucleotide sequence extending from nucleotide 598 to nucleotide 826 of the sequence represented on FIG. 3B;
(v) a nucleotide sequence extending from nucleotide 598 to nucleotide 897 of the sequence represented on FIG. 3B;
(vi) a nucleotide sequence extending from nucleotide 598 to nucleotide 920 of the sequence represented on FIG. 3B;
(vii) a fragment of sequence (i), (ii), (iii), (iv), (v), or (vi) which is capable of encoding a shorter form core+1 protein as defined above, in mammalian cells, or an epitope thereof.

The invention also provides variant nucleotide sequences derived from different isolates, which encode the shorter form core+1 proteins illustrated on FIG. 8.

The invention thus provides a nucleotide sequence comprising a Hepatitis C virus core protein coding sequence which is derived from the nucleotide sequence represented on FIG. 3B as a result of one or several mutations selected among the following:

-   -   in the 9th and 11th codons, a mutation which respectively corresponds to a triple substitution of two A to G and of an A to C; or
    -   in the 9th, 10th and 11th codons, a mutation which respectively consists of a substitution of one A to G and two A to C; or
    -   in the 9th codon, a mutation which consists of a substitution of A to G; or
    -   in the 10th codon, a mutation which consists of a substitution of A to C; or
    -   a substitution of an initiator codon into a terminator codon; or
    -   a substitution of the 25th codon into a stop codon; or
    -   a substitution of the 43rd codon into a stop codon; or
    -   a substitution of the 79th codon into a stop codon; or
    -   a substitution of the 87th codon into a stop codon; or
    -   a substitution of the 85th codon into a stop codon and/or a substitution of the 87th codon into a stop codon.
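
A codon-to-stop substitution of the kind listed above can be sketched as follows. This is an illustrative helper only: the function names and the toy sequence are hypothetical, the choice of TAA as the replacement stop codon is arbitrary, and the sketch assumes that the sequence string begins with the first codon of the frame in which codons are numbered.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def replace_codon(seq: str, codon_number: int, new_codon: str = "TAA") -> str:
    """Replace the codon at 1-based position codon_number with new_codon."""
    if len(new_codon) != 3:
        raise ValueError("a codon has exactly three nucleotides")
    start = (codon_number - 1) * 3
    if codon_number < 1 or start + 3 > len(seq):
        raise ValueError("codon number falls outside the sequence")
    return seq[:start] + new_codon + seq[start + 3:]


def first_in_frame_stop(seq: str):
    """Return the 1-based number of the first in-frame stop codon, or None."""
    seq = seq.upper().replace("U", "T")
    for number, i in enumerate(range(0, len(seq) - 2, 3), start=1):
        if seq[i:i + 3] in STOP_CODONS:
            return number
    return None


# Hypothetical toy coding sequence, NOT taken from FIG. 3B.
wild_type = "ATGGCATGCCTT"
mutant = replace_codon(wild_type, codon_number=3)
```

Here `first_in_frame_stop(wild_type)` finds no stop codon, whereas the mutant terminates at codon 3, mirroring how an introduced stop codon interrupts an otherwise open frame.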

The nucleotide sequences of the invention are especially in a purified form, i.e., they are isolated from their natural environment in HCV.

The above mutations consist in generating a missense codon in a specific position of the nucleotide sequence, for detecting codons which are critical for core+1 expression in vivo. As referred to herein, a “stop codon”, “missense codon”, “nonsense codon”, “terminator codon” and “chain-terminating codon” are capable of stopping translation since no amino acid corresponds to said codon. Stop codons are usually UAA, UAG and UGA. The presence of a stop codon disrupts the core+1 coding region and fails to support the production of the core+1 protein when said stop codon is in frame with the open reading frame (ORF) of the core+1 coding region.

The invention also provides nucleotide sequences which are functional variants of said nucleotide sequences and have at least 70% identity, preferably 80% or 90% identity.

As used herein, the term “variant” refers to a nucleotide sequence substantially homologous to the HCV shorter form core+1 nucleotide sequence, which has however undergone mutations, especially one or more deletions, insertions or substitutions resulting from the genetic code degeneracy. The variant nucleotide sequence is at least 70% identical to a nucleotide sequence encoding native shorter form core+1 protein, most preferably at least 80% or 90% identical. Determination of variant sequences according to the present invention may be performed using the GAP computer program (Devereux et al., Nucl. Acids Res. 12: 387, 1984). Variants can especially comprise conservative substitutions such that physicochemical characteristics of the mutated sequence are substantially identical to those of the native shorter form core+1 nucleotide sequence. Variants can also be chosen for their capacity to encode a variant shorter form core+1 protein which is recognized by antibodies directed against native shorter form core+1 protein.
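
The identity thresholds above can be made concrete with a simplified sketch. It does not reproduce the GAP program: it assumes the two sequences have already been aligned to equal length (with `-` marking gaps), and the way gap columns are counted is a choice made for this illustration. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of compared alignment columns holding identical residues.

    Columns where both sequences have a gap are skipped; a column with a
    gap in only one sequence is counted as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no columns to compare")
    return 100.0 * matches / compared


# Hypothetical pre-aligned toy sequences, NOT taken from FIG. 3B or FIG. 8.
native = "ATGGCATGCC"
candidate = "ATGGCTTGCA"
identity = percent_identity(native, candidate)
is_variant_at_70 = identity >= 70.0
```

Different alignment programs and gap conventions can give somewhat different percentages for the same pair, so a threshold is only meaningful together with the method used to compute it.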

Particular variants of the nucleotide sequence of FIG. 3B are derivable from the amino-acid sequences of FIG. 8, including considering the variability resulting from degeneracy of the genetic code.

The invention also concerns a nucleotide sequence, such as an isolated DNA or RNA sequence, which hybridises under stringent conditions to a nucleotide sequence disclosed herein. Those nucleotide sequences also comprise complementary sequences of nucleotide sequences encoding a mosaic of proteins comprising core protein, shorter form core+1 protein, fragments of shorter form core+1 protein and their derivatives.

The invention also relates to a nucleotide sequence which hybridises under stringent conditions with at least a complementary sequence of any nucleotide sequence disclosed above.

The invention also concerns nucleotide sequences complementary to a sequence selected among the above disclosed sequences.

As used herein, the expression “stringent conditions” refers to conditions of severe stringency such as defined by Sambrook et al. in Molecular Cloning: a laboratory manual (1989). These conditions of high stringency are defined as the following hybridisation conditions: use of a prewashing solution for the nitrocellulose filters of 5× SSC, 0.5% SDS, 1.0 mM EDTA (pH 8.0), hybridisation conditions of 50% formamide, 6× SSC at 42° C. and washing conditions at 66° C., 0.2× SSC and 0.1% SDS. Protocols are known to those having ordinary skill in the art. Moreover, the skilled artisan will recognize that the temperature and wash solution salt concentration can be adjusted as necessary according to experimental constraints.

The invention provides a chimeric gene which comprises a promoter operatively linked to a nucleotide sequence such as described above. Commonly used promoter sequences are selected from the group consisting of the lactose promoter system, the tryptophan promoter system, the tac promoter and the CMV promoter.

As stated herein, a “chimeric gene” or “recombinant gene” consists of a DNA molecule resulting from the combination or linkage of two DNA sources, said sources not being found together naturally.

The shorter form core+1 sequence is “operatively linked” to a promoter so as to allow suitable transcription and translation of said shorter form core+1 sequence; said regulatory sequence is thus functionally related to the shorter form core+1 DNA sequence.

In a particular embodiment, the chimeric gene comprises a CMV/T7 chimeric promoter and a chloramphenicol acetyl transferase (CAT) gene in a first cistron, and the entire internal ribosome entry site (IRES) of the HCV core coding sequence and part of the wild type core coding sequence fused to the LUC gene in a second cistron. The dicistronic cassettes CAT-IRES-core-LUC were placed under the control of a CMV/T7 chimeric promoter, wherein CMV comes from Cytomegalovirus and T7 from bacteriophage, to allow the use of the same DNA plasmid for expression in vivo and in vitro. Furthermore, to eliminate the possibility of internal translation initiation events triggered by the initiator codon of the LUC gene, the ATG was changed to a GGG codon. With the above modification, the expression of the LUC gene is directly related to the expression of the fused core and coding sequences. Moreover, CAT activity serves as an internal control to standardize transfection efficiencies in vivo or potential variations in the transcript abundance in vitro.
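
The role of CAT as an internal control can be illustrated with a short sketch in which each LUC reading is divided by the CAT reading from the same sample before constructs are compared. The function names and all numerical readings are hypothetical; the source does not specify the arithmetic actually used, so a simple ratio is assumed here.

```python
def normalized_expression(luc_signal: float, cat_signal: float) -> float:
    """LUC signal divided by the CAT signal measured in the same sample."""
    if cat_signal <= 0:
        raise ValueError("CAT signal must be positive to normalize")
    return luc_signal / cat_signal


def relative_expression(sample, reference) -> float:
    """Normalized expression of sample relative to a reference construct.

    Both arguments are (LUC, CAT) pairs.
    """
    return normalized_expression(*sample) / normalized_expression(*reference)


# Hypothetical (LUC, CAT) readings in arbitrary units.
reference_construct = (1200.0, 400.0)
test_construct = (1800.0, 300.0)
ratio = relative_expression(test_construct, reference_construct)
```

With these made-up readings the raw LUC signal rises only 1.5-fold, but after correcting for the lower CAT signal of the test sample the normalized expression is 2-fold higher, which is why the internal control matters.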

In a preferred embodiment, the chimeric gene is constructed such that the LUC gene is fused to the core sequence in the 0, +1 or −1 frame.

The open reading frame (ORF) is in frame 0, +1 or −1 such that translation is not interrupted by a termination codon, in order to obtain a polypeptide chain, i.e. the LUC polypeptide in this specific case.

It is another object of the invention to provide a vector comprising a chimeric gene such as described above. A vector according to the invention is particularly appropriate for transferring DNA sequences, e.g. a chimeric gene, and has properties allowing protein expression.

This vector is a plasmid, a cosmid, a phage or a virus. In a preferred embodiment, the vector is a plasmid selected from the group consisting of pHPI-1333 and pHPI-1335 represented in FIG. 1.

The invention also relates to recombinant cells, especially mammalian cells, transfected with a nucleotide sequence of the invention. By “transfection” or “transformation” is understood the introduction of DNA into a recipient cell and its subsequent integration into said recipient cell chromosomal DNA. Methods of transfection or transformation are usual methods well known in the art, e.g. electroporation. Transfection can be either transient or stable.

Recombinant cells, which are transfected or transformed by a vector, are preferably animal, mammalian or human cells. In a particular embodiment, recombinant cells are BHK-21 or Huh-1.

The invention also provides antibodies which are raised against shorter form core+1 protein or against a peptide thereof.

The invention relates to purified antibodies which specifically bind to shorter form core+1 protein, i.e., without cross-reacting with core protein and/or core+1 protein.

As used herein, the “cross reaction” is a serological reaction in which antibodies against one antigen react with a non-identical but immunologically closely related antigen. In the present invention, the antigen is the shorter form core+1 protein, which has a polypeptide sequence close to core protein and/or core+1 protein.

Thus antibodies binding specifically to shorter form core+1 protein may be directed against epitopes of said protein which are not present or which are not exposed in other proteins, especially in HCV core protein or HCV core+1 protein.

The invention also concerns purified antibodies which specifically bind to polypeptide fragments common to shorter form core+1 protein, core+1 protein and optionally core protein. In a preferred embodiment, the common polypeptide fragments of the above proteins are comprised between nucleotide 897 and nucleotide 920. A preferred polypeptide fragment is illustrated above.

Monoclonal antibodies directed against the antigen of the invention can also be prepared. Methods for producing monoclonal antibodies are usual methods well known in the art, including preparation of hybridomas and isolation of the produced monoclonal antibodies having the required binding affinity.

Within an aspect of the invention, shorter form core+1 protein is used to prepare antibodies that specifically bind to the shorter form core+1 protein.

The invention also relates to a purified polypeptide which specifically binds to at least one purified antibody which itself binds to shorter form core+1 protein or polypeptide fragments thereof, or to an antibody produced by a method using the shorter form core+1 protein or polypeptide fragments thereof as antigen. As used herein, the expression “purified polypeptide” refers to the shorter form core+1 protein of Hepatitis C virus representing at least 70%, preferably 80% or 90%, of polypeptides recognized by the purified antibody illustrated above.

Methods to detect an infection by Hepatitis C virus are described below. It is possible to detect, in a biological sample, the infection of an individual by Hepatitis C virus by determining the presence or absence of the shorter form core+1 protein. In a preferred embodiment, the shorter form core+1 protein is detected with antibodies which are immunologically reactive with said protein by forming an antigen-antibody complex. Methods for determining the amount of antibody bound to an antigen are well known in the art. For example, the antibody may carry a detectable marker; a standard curve may then be generated using known amounts of the tested antigen and the amount of signal generated by the marker.
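
The standard-curve step mentioned above can be sketched as a straight-line fit of signal against known antigen amounts, followed by reading an unknown sample off the fitted line. The linear response is an assumption made for this illustration (many immunoassays are fitted with non-linear models), and the function names and all values are hypothetical.

```python
def fit_line(amounts, signals):
    """Ordinary least-squares fit of signal = slope * amount + intercept."""
    n = len(amounts)
    if n < 2 or n != len(signals):
        raise ValueError("need at least two paired calibration points")
    mean_x = sum(amounts) / n
    mean_y = sum(signals) / n
    sxx = sum((x - mean_x) ** 2 for x in amounts)
    if sxx == 0:
        raise ValueError("calibration amounts must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(amounts, signals))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amount_from_signal(signal, slope, intercept):
    """Invert the fitted line to estimate the amount giving this signal."""
    if slope == 0:
        raise ValueError("a flat standard curve cannot be inverted")
    return (signal - intercept) / slope


# Hypothetical calibration data: known antigen amounts and marker signals.
known_amounts = [0.0, 1.0, 2.0, 4.0]
measured_signals = [0.1, 0.6, 1.1, 2.1]
slope, intercept = fit_line(known_amounts, measured_signals)
unknown_amount = amount_from_signal(1.6, slope, intercept)
```

Estimates are only reliable for signals that fall within the range spanned by the calibration points.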

The invention also relates to a method for the in vitro detection of infection by Hepatitis C virus, which consists in detecting antibodies recognizing the shorter form core+1 protein in a biological sample. The shorter form core+1 protein can be used as an antigen to identify antibodies recognizing said shorter form in materials and to determine the concentration of the antibodies in those samples.

As stated herein, biological samples of an individual include but are not limited to biological fluids such as urine and blood samples, or tissue and cells.

In a specific embodiment, these methods employ the formation of an antigen-antibody complex for the detection of HCV by means of immunoassay (direct detection) or ELISA (indirect detection). The use of immunoassay or observation of antigen-antibody complexes by secondary reactions is well known in detecting and quantifying humoral components in fluids.

The invention also relates to immunogenic compositions comprising HCV shorter form core+1 protein or a peptide thereof, including a specific immunogenic peptide.

Such compositions are useful to raise an immune response, either an antibody response and/or, preferably, a CTL response in patients. Advantageously, the CTL response is such that it protects a patient against HCV infection or against its consequences. The nucleic acid sequences may also, alternatively, be involved in the preparation of immunogenic compositions.

The invention also concerns the shorter form core+1 protein for use in therapeutic compositions for the treatment of HCV infection or its consequences. Interestingly, the shorter form core+1 protein may interfere with the viral life cycle and especially down regulate HCV proliferation in patients.

The invention further relates to a method of screening compounds for their capacity to interact with viral propagation in cells infected by HCV, said method comprising:

-   -   a) contacting said cells with a candidate compound;    -   b) determining interaction between said candidate compound and        expression of said shorter form core+1 protein.

The invention also concerns a method of screening compounds whereininteraction is determined by measuring the expression level of shorterform core+1 protein, prior and after contacting the HCV infected cellswith a candidate compound.
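The before/after comparison described above can be sketched as a simple fold-change calculation. The cutoff of 2.0, the compound names and the expression readings are hypothetical values chosen for illustration; the specification does not define a threshold.

```python
# Sketch only: compare shorter form core+1 expression before and after
# contacting cells with a candidate compound. The threshold is an assumption.

def expression_change(before, after):
    """Fold change in expression level after exposure to the compound."""
    if before <= 0:
        raise ValueError("baseline expression must be positive")
    return after / before


def classify(before, after, threshold=2.0):
    """Label a candidate by comparing its fold change with a hypothetical cutoff."""
    fold = expression_change(before, after)
    if fold >= threshold:
        return "increases expression"
    if fold <= 1.0 / threshold:
        return "decreases expression"
    return "no clear effect"


# Invented readings: (expression before, expression after), arbitrary units.
readings = {
    "compound_A": (100.0, 240.0),
    "compound_B": (100.0, 35.0),
    "compound_C": (100.0, 110.0),
}
results = {name: classify(before, after) for name, (before, after) in readings.items()}
```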

Alternative translation mechanisms are used by viruses to regulate the production of structural and enzymatic proteins and, ultimately, virus propagation. Altering these translation mechanisms disrupts the virus life cycle and interferes with virus production by eliminating or reducing propagation of said virus. Therefore, alternative translation mechanisms provide a major target on which antiviral agents can act.

Translation mechanisms are an attractive target for identifying agents that affect the efficiency of these processes. Indeed, a change in translation initiation (non-ATG codon, modification of the start codon) can have a large effect on virus production. Furthermore, compounds that change the efficiency of translation mechanisms function at low concentrations, equivalent to therapeutic concentrations, which do not disturb the translational machinery of, for example, cells infected by HCV.

As used herein, the term “compound” refers to inorganic or organic chemical or biological compounds, either natural (isolated) or synthetic, and especially encompasses nucleic acids, proteins, polypeptides, peptides, glycopeptides, lipids, lipoproteins and carbohydrates.

Blocking the translation of the core ORF has a positive effect on the level of expression of the core+1 ORF. In a preferred embodiment, translation is increased two-fold when the initiator ATG codon of the HCV polyprotein is converted into a stop codon. Such a mutation blocks core expression but also increases the level of core+1 expression in vivo.

Any cells into which nucleotide sequences, chimeric genes and vectors may be transfected can be used in the screening method of the invention. In a preferred embodiment of the invention, transfected cells are animal, mammalian or human cells. In a further preferred embodiment, the cells used in the methods of screening compounds are BHK-21 or Huh-7.

The invention also concerns the compounds identified as a result of carrying out the above methods of screening.

Such compounds identified by the above methods of the invention are useful for the treatment of disorders induced by or associated with an infection by Hepatitis C virus.

These compounds, selected according to the above screening methods, can be used for the preparation of a drug for the treatment of disorders induced by or associated with infection by HCV.

Some of these compounds may modulate disorders induced by or associated with infection by HCV by restoring or improving translation of the shorter form core+1 protein. Examples of disorders resulting from HCV proliferation in the host are cirrhosis, hepatocellular carcinoma and diseases associated with chronic liver infection.

LEGENDS TO THE FIGURES

FIG. 1. Expression analysis of the core+1-LUC chimeric gene.

Panel A: Schematic representation of the CAT-LUC dicistronic constructs used for the tagging experiments. The entire IRES (nucleotides 9-341) and part of the core coding sequence (nucleotides 342-630) from HCV-1 and HCV-1a (H) were fused with the LUC gene under the control of both CMV and T7 promoters of pHPI-1046. The nucleotide sequences of the junction between the core and luciferase coding regions are illustrated. The first codon of the luciferase cistron, GGG, derived from the ATG initiator by site-directed mutagenesis, is boxed. The LUC gene was fused in the 0 frame relative to the preceding core coding sequence in pHPI-1331 (HCV-1) and pHPI-1334 [HCV-1a (H)], in the +1 frame in pHPI-1333 (HCV-1) and pHPI-1335 [HCV-1a (H)], and in the −1 frame in pHPI-1332 (HCV-1) and pHPI-1336 [HCV-1a (H)]. The underlined nucleotide indicates an insertion of a thymidine residue, and the inverted triangle indicates a deletion of an adenine residue.

Panels B, C: In vivo (a) and in vitro (b) expression of the HCV-1 (B) and HCV-1a (H) (C) fusion constructs.

-   (a) Duplicate cultures of BHK-21 cells were transfected with each construct and the relative ratio of LUC activity to CAT quantity was determined. Bars represent the means obtained in two separate experiments in duplicate. Error bars represent the standard deviation.
-   (b) Each construct was transcribed in vitro and equal amounts of all RNAs were translated in Flexi rabbit reticulocyte lysates. Translation products were directly separated by SDS-PAGE and analyzed by autoradiography. Fusion proteins are indicated by filled arrowheads. Open arrowheads show the CAT protein. NC means negative control.

FIG. 2. Effect of mutations within codons 8-11 of the HCV-1 (N18, N19) and HCV-1a (H) (N15, N16) core coding sequence on the expression of the core+1-LUC chimeric gene.

Panels A, B: The core nucleotide sequences in the region of codons 8-11 of the wild-type HCV-1 (A) and HCV-1a (H) (B) plasmids, as well as of the corresponding mutant variants N18, N19 (HCV-1) (A) and N15, N16 [HCV-1a (H)] (B). The wild-type sequences of codons 8-11 are shown in bold. The arrows indicate the inserted mutations and the bold characters indicate the mutated nucleotides and affected amino acids. The numbers in brackets indicate the number of the mutated codons.

Panels C, D: The HCV-1 (C) and HCV-1a (H) (D) wild-type (pHPI-1333 and pHPI-1335, respectively) and corresponding mutants [pHPI-1382 (N18), -1383 (N19), and pHPI-1395 (N15), -1396 (N16), respectively] were used to transfect BHK-21 cells (a) or transcribed in vitro and equal amounts of RNAs were translated in Flexi rabbit reticulocyte lysates (b).

-   (a) Duplicate cultures of BHK-21 cells were transfected with the wild-type or the mutated constructs. The activity of each mutant was calculated by determining the ratio of LUC activity to CAT quantity and is expressed as a percentage of that of the wild-type. Bars represent the means observed for three separate experiments, each carried out in duplicate. Error bars correspond to the standard deviation.
-   (b) 5 μl of the [35S]-methionine-labeled in vitro translation products were separated by 12% SDS-PAGE and analyzed by autoradiography. The fusion protein core+1-LUC is indicated by the filled arrowhead. Open arrowheads show the CAT protein. WT and NC respectively mean wild-type and negative control.

FIG. 3. Mutational analysis of the core/core+1 coding region. Nucleotide sequence of the HCV-1 core coding region including mutations affecting the 0 (A) and +1 (B) open reading frames (ORFs). Inserted mutations are indicated by arrows. Mutated nucleotides and affected amino acids are shown in bold.

FIG. 4. Effect of mutations within nucleotide sequences that flank codons 8-11 of the HCV-1 (A) and the HCV-1a (H) (B) core coding region on the expression of the core+1-LUC chimeric gene.

The wild-type pHPI-1333 (HCV-1) and pHPI-1335 [HCV-1a (H)] and the N3, N6 mutant variants pHPI-1343, pHPI-1344 (HCV-1) and pHPI-1346, pHPI-1347 [HCV-1a (H)], respectively, were used to transfect BHK-21 cells (a) or transcribed in vitro and equal amounts of RNAs were translated in Flexi rabbit reticulocyte lysates (b).

-   (a) Duplicate cultures of BHK-21 cells were transfected with the wild-type or the mutated constructs. The relative activity of each mutant variant was determined as described in the legend of FIG. 2. Bars represent the means from two separate experiments, each performed in duplicate. Error bars indicate the standard deviation.
-   (b) Translation products were resolved by SDS-PAGE and analyzed by autoradiography. Filled and open arrowheads show the chimeric core+1-LUC and CAT proteins, respectively. WT and NC respectively mean wild-type and negative control.

FIG. 5. Mutational analysis within the core+1 coding sequence of HCV-1 and HCV-1a (H) isolates.

The HCV-1 (A, C) and HCV-1a (H) (B, D) wild-type (pHPI-1333 and pHPI-1335, respectively) and mutated plasmids [pHPI-1342 (N1), -1380 (N21), -1381 (N22) and pHPI-1345 (N1), -1398 (N21), -1397 (N22), respectively] were expressed in BHK-21 (a) and Huh-7 (b) cells or in Flexi rabbit reticulocyte lysates (c).

-   (a) and (b) The experiments were carried out in duplicate and repeated at least twice. The relative activity of each mutant variant was determined as described in the legend of FIG. 2. Bars represent the means. Error bars correspond to the standard deviation.
-   (c) Translation products were separated by SDS-PAGE and analyzed by autoradiography. The positions of the hybrid core+1-LUC and the CAT proteins are indicated by filled and open arrowheads, respectively. WT and NC respectively mean wild-type and negative control.

FIG. 6. Effect of mutations targeting codons ATG598 and ATG604 of the core+1 coding sequence.

Duplicate cultures of BHK-21 (A) and Huh-7 (B) cells were transfected with the dicistronic HCV-1 wild-type (pHPI-1333) and mutated constructs: pHPI-1399 (N23), pHPI-1400 (N24) and pHPI-1401 (N25). The relative activity of each mutant variant was calculated as described in the legend of FIG. 2. Bars represent the means from two separate experiments, each carried out in duplicate. Error bars indicate the standard deviation. WT and NC respectively mean wild-type and negative control.

FIG. 7. Expression of the chimeric core+1-LUC protein in transfected cells.

Panel A: Schematic representation of the monocistronic constructs pHPI-1362 (core+1-LUC) and pHPI-1363 (core-1-LUC).

Panel B: Duplicate cultures of BHK-21 cells were transfected with the monocistronic core+1-LUC construct pHPI-1362 or the dicistronic core+1-LUC pHPI-1333 and the relative luciferase activity was determined. Bars represent the means from two separate experiments. Error bars represent the standard deviation.

Panel C: Immunoprecipitation of [35S]-methionine-labeled translation products of the core+1-LUC and core-1-LUC containing monocistronic constructs from transiently transfected BHK-21 cells using an anti-LUC goat polyclonal antibody. The immunoprecipitates were analyzed by SDS-PAGE followed by autoradiography. The hybrid core+1-LUC protein produced in vivo is marked by a dot. The open arrowhead shows the [35S]-methionine-labeled core+1-LUC protein synthesized in rabbit reticulocyte lysates. NC means negative control.

FIG. 8. Variability of the shorter form core+1 coding sequence among different variants of HCV.

FIG. 9. List of oligonucleotides and constructs used in the mutational analysis.

EXAMPLES

1. Materials and Methods

1.1 Site-Directed Mutagenesis and Plasmid Construction

Site-directed mutagenesis was carried out using the Quikchange™ kit (Stratagene). The templates and oligonucleotides used in the mutational analysis and the corresponding mutants are listed in FIG. 9. All mutations were confirmed by sequence analysis.

The HCV-1 cDNA sequences thus obtained were cloned in pHPI-888, described in Varaklioti et al. 2002. The HCV-1 cDNA is obtained from pHPI-888 by PCR using sense primer HCVF17 (nucleotides 9-27), 5′-CGCCGGATCCTGATGGGGGCGACACTCCAC-3′, plus antisense primer HCVR38 (nucleotides 342-322), 5′-CGCCGGATCCGGTGCACGGTCTACGAGACC-3′, and using sense primer HCVF36 (nucleotides 268-287), 5′-CGCCGGATCCGGTCGCGAAAGGCCTTGTGG-3′, plus antisense primer HCVR27 (nucleotides 1052-1030), 5′-CGCCGGATCCTCGAGGCGTTGCCCTCACGA-3′. Plasmid pHPI-888 is based on the pGEM-3zf(+) vector (Promega) and contains the cDNA sequence (nucleotides 9-1054) from the HCV-1 isolate (IRES-core HCV-1 sequence: accession No. M62321).

The HCV-1a (H) cDNA sequence is obtained by PCR using the primers below, amplifying the sequence of HCV-1a (H) from pDNA-C1. The pDNA-C1 plasmid is created by insertion of the first 1064 nucleotides of HCV strain H into the vector pcDNA3 (Invitrogen). The cloned sequence included the 5′-NCR (nucleotides 1 to 341), the nucleocapsid coding sequence (nucleotides 342 to 914), and 150 nucleotides encoding the first 50 amino acids of the envelope E1 (nucleotides 915 to 1064) (Inchauspe et al. 1991, IRES-core HCV strain H sequence: accession No. M67463).

The dicistronic constructs pHPI-1331, -1333 and -1332 contain the chloramphenicol acetyl transferase (CAT) gene as the first cistron, followed by the entire IRES and part of the wild-type core coding sequences (nucleotides 9-630) from the prototype HCV-1 isolate fused to the firefly LUC gene in the 0, +1 and −1 frames, respectively. They were produced by site-directed mutagenesis from dicistronic pHPI-1311, -1313 and -1312, respectively, using primers 5′-TGGATCCAAGGGGAAGACGCC-3′ (sense) and 5′-GGCGTCTTCCCCTTGGATCCA-3′ (antisense). This set of primers converts the start codon of the luciferase coding region (ATG) into a glycine codon (GGG). pHPI-1311, -1313 and -1312 were constructed by replacing the 203-bp NheI-XbaI fragment of the dicistronic pHPI-1046 (Psaridi et al. 1999), carrying nucleotides 249-407 of the IRES-core sequences fused with part of the LUC gene, with the 435-bp NheI-XbaI fragments of pHPI-766, -767, -768 (Varaklioti et al. 2002), containing nucleotides 249-630 of the IRES-core sequences fused with the first 50 nucleotides of the LUC gene. The dicistronic constructs pHPI-1334, -1335 and -1336 carry the entire IRES and part of the wild-type core coding region (nucleotides 9-630) from HCV-1a (H) fused to the LUC gene in all three frames (0, +1 and −1, respectively). They were derived by site-directed mutagenesis using pHPI-1328, -1329 and -1330, respectively, as templates and primers 5′-TGGATCCAAGGGGAAGACGCC-3′ (sense) and 5′-GGCGTCTTCCCCTTGGATCCA-3′ (antisense). The primers change the initiator codon of the luciferase coding region into a glycine codon (ATG→GGG). pHPI-1328, -1329 and -1330 were generated by replacing the 203-bp NheI-XbaI fragment of pHPI-1046 with the 435-bp NheI-XbaI fragments of pHPI-748, -749, -750 (Varaklioti et al. 2002), respectively. To facilitate the characterization of plasmids, most of the mutations inserted into the dicistronic constructs pHPI-1333 (HCV-1) and pHPI-1335 [HCV-1a (H)] were recloned into pHPI-1046 by replacing the wild-type 203-bp NheI-XbaI fragment with the corresponding fragment of the mutated template. The monocistronic constructs pHPI-1362 and -1363 contain the same IRES-core-LUC cassette as pHPI-1333 and -1332, respectively, cloned between the HindIII and SalI sites of pEGFPN3 (Clontech).
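As a quick consistency check on the mutagenic primer pair quoted above, the following sketch verifies that the antisense primer is the reverse complement of the sense primer and locates the GGG codon. The offset of the mutated codon within the primer and the reconstructed wild-type template are inferences made for illustration (restoring ATG in place of GGG); they are not stated in the specification.

```python
# Sketch only: sanity check of the ATG->GGG mutagenic primers quoted in the text.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


SENSE = "TGGATCCAAGGGGAAGACGCC"      # sense primer, copied from the text
ANTISENSE = "GGCGTCTTCCCCTTGGATCCA"  # antisense primer, copied from the text

primers_are_complementary = reverse_complement(SENSE) == ANTISENSE

# Assumption: the mutated codon occupies positions 10-12 of the sense primer.
CODON_OFFSET = 9
mutated_codon = SENSE[CODON_OFFSET:CODON_OFFSET + 3]

# Inferred (hypothetical) wild-type template, obtained by restoring the ATG.
inferred_template = SENSE[:CODON_OFFSET] + "ATG" + SENSE[CODON_OFFSET + 3:]
mismatches = sum(1 for a, b in zip(SENSE, inferred_template) if a != b)
```

Under these assumptions the primer differs from the inferred template at two positions, since the third base of ATG and GGG is shared.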

1.2 In Vitro Translation

For all the plasmids, Flexi rabbit reticulocyte lysates (Promega) supplemented with 120 mM KCl and 0.5 mM Mg(OAc)₂ were used. DNA (3 μg) from each plasmid was linearized and transcribed in vitro with T7 RNA polymerase (Promega) according to the manufacturer's instructions. Wild-type pHPI-1331, -1333, -1332, -1334, -1335, -1336 and the corresponding mutated dicistronic constructs were linearized with PstI.

In vitro translation experiments were carried out on uncapped RNAs in a total volume of 25 μl using [35S]-methionine (Amersham Biosciences). The translation products (5 μl) were analyzed by 12% SDS-PAGE, transferred onto nitrocellulose membranes and visualized by autoradiography.

1.3 Cells and DNA Transfection

BHK-21 and Huh-7 cells were maintained in Dulbecco's modified Eagle's medium (DMEM) supplemented with 10% fetal bovine serum (DMEM/FBS) at 37° C. in a 5% CO₂ incubator. Cells seeded in 6-well plates (60% confluence) were transfected with 1 μg plasmid DNA in the presence of Lipofectamine Plus™ reagent (Invitrogen, Cat. No. 10964-013) according to the manufacturer's protocol. The medium was replaced with fresh DMEM/FBS 24 h post-transfection. The cells were washed twice with phosphate-buffered saline (PBS) 48 h post-transfection and lysed in 260 μl of 1× luciferase lysis buffer (Promega). Firefly LUC was quantified by mixing 20 μl of extract with 100 μl of luciferase assay reagent (Promega) and measuring the luminescence directly with a Turner TD-20/20 luminometer. In the case of the dicistronic constructs, CAT was quantified with the CAT-ELISA kit according to the manufacturer's instructions (Roche, Cat. No. 1363727).
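The normalization used for the dicistronic constructs (LUC activity divided by CAT quantity, then expressed relative to the wild-type construct) can be sketched as follows. The readings are invented, and the use of the sample standard deviation is an assumption, since the text does not specify which estimator was used.

```python
# Sketch only: LUC/CAT normalization and expression as a percentage of wild-type.
# All readings are invented illustrative values.
from statistics import mean, stdev


def normalized_activity(luc, cat):
    """Ratio of LUC activity to CAT quantity for one transfected culture."""
    return luc / cat


def percent_of_wild_type(mutant_ratios, wild_type_ratios):
    """Mean and sample standard deviation of mutant activity as % of wild-type."""
    wt_mean = mean(wild_type_ratios)
    percents = [100.0 * ratio / wt_mean for ratio in mutant_ratios]
    return mean(percents), stdev(percents)


# Hypothetical (LUC, CAT) pairs: two experiments, each in duplicate.
wild_type_readings = [(1000.0, 10.0), (1100.0, 10.0), (900.0, 10.0), (1000.0, 10.0)]
mutant_readings = [(500.0, 10.0), (460.0, 10.0), (540.0, 10.0), (500.0, 10.0)]

wild_type = [normalized_activity(luc, cat) for luc, cat in wild_type_readings]
mutant = [normalized_activity(luc, cat) for luc, cat in mutant_readings]

mutant_mean_percent, mutant_sd_percent = percent_of_wild_type(mutant, wild_type)
```

Dividing by CAT corrects for differences in transfection efficiency between cultures before any comparison with the wild-type is made.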

1.4 Antibodies Production

The goat polyclonal antibody against firefly luciferase was obtained from Promega Corporation (Cat. No. G7451) at a concentration of 1 mg/ml.

1.5 Immunoprecipitation Analysis

Thirty-six hours after transfection with pHPI-1362 or pHPI-1363, monolayers of BHK-21 cells (~10⁷ cells) were metabolically labeled for 12 h with 20 μCi [35S]-methionine (Amersham Biosciences) per ml of methionine-free medium supplemented with 1% FBS. The labeled cells were rinsed with PBS and lysed in a 500 μl total volume of triple detergent buffer consisting of 50 mM Tris (pH 8), 150 mM NaCl, 0.1% SDS, 1% Nonidet P-40, 0.5% sodium deoxycholate and 100 μg/ml phenylmethylsulfonyl fluoride (PMSF). Cell lysates were mixed by vortexing and centrifuged at 14000×g for 10 min at 4° C. Clarified lysates were incubated with 10 μl of anti-LUC polyclonal antibody on a rocker overnight at 4° C. Protein G PLUS-Agarose (Santa Cruz Biotechnology) was added (20 μl) to this mixture and the reactions were incubated under the same conditions for an additional 2 h. After microcentrifugation, the agarose beads were washed three times with a buffer containing 50 mM Tris (pH 8), 150 mM NaCl, 0.1% Nonidet P-40 and 1 mM EDTA. The immunoprecipitates were subsequently resolved by 10% SDS-PAGE, transferred onto nitrocellulose membranes and detected by autoradiography.

2. Results

2.1 Core+1 ORF is Efficiently Expressed in Transfected Cells (In Vivo)

It has been previously reported that in vitro assays show detectable expression of the core+1 ORF only from the HCV-1 core coding region and not from the HCV-1a (H) isolate.

The translation of the core+1 protein from the HCV-1 and HCV-1a (H) isolates in mammalian cells has been analysed. The expression of the core+1 ORF has been compared with that in an in vitro system based on rabbit reticulocyte lysates. The cDNA sequences containing the entire IRES and part of the core coding sequences (nucleotides 9-630) from HCV-1 and HCV-1a (H) fused to the LUC gene in all three frames were transferred into a dicistronic vector in which CAT was the first gene. The dicistronic cassettes CAT-IRES-core-LUC were placed under the control of a CMV/T7 promoter to allow the use of the same DNA plasmid for expression in vivo and in vitro. These constructs are illustrated in FIG. 1A of the present application. The expression of the LUC gene is directly related to the expression of the fused core or core+1 coding sequences, and CAT activity serves as an internal control to standardize transfection efficiencies in vivo or potential variations in transcript abundance in vitro. Each construct was transfected into BHK-21 cells and forty-eight hours later LUC and CAT activities were measured.

In the case of HCV-1, substantial amounts of luciferase were expressed from the core+1-LUC cassette of pHPI-1333, as the levels of luciferase activity were similar to those of the core-LUC fusion protein derived from pHPI-1331 (FIG. 1B[a]). Only background levels of luciferase activity were detected from the expression of the corresponding negative control core-1-LUC construct (pHPI-1332). Surprisingly, in contrast to the in vitro results, very high levels of luciferase activity were observed from construct pHPI-1335, which contains the core+1 ORF from HCV-1a (H) fused to the LUC gene. The levels were about 200% of those of the HCV-1a (H) core-LUC hybrid protein yielded from pHPI-1334 (FIG. 1C[a]). The corresponding negative control plasmid (pHPI-1336) resulted in background levels of luciferase expression. Thus, the HCV-1 core+1 LUC-tagged protein is efficiently produced in vivo, with translation levels similar to those of the core coding sequence.

These results indicate that, in contrast to expression studies in rabbit reticulocyte lysates, the HCV-1a (H) isolate, like the HCV-1 isolate, efficiently expresses the core+1 ORF in transiently transfected BHK-21 cells.

These results also reveal differences in the translation mechanism directing the expression of the core+1 ORF in vitro and in transfected cells.

2.2 A-Rich Sequence at Codons 8-11 of the Core Coding Sequence is not Essential for the Expression of the Core+1 Protein In Vivo

It has been shown that the core coding region of HCV-1a (H) lacks the 10 consecutive A residues representing codons 8-11 (nucleotides 364-373) of the HCV-1 genome, which is a known slippery site for ribosomal frameshifting. The importance of the 10-A residue region for the expression of the core+1 ORF in transfected cells was therefore analysed and compared to that in rabbit reticulocyte lysates.

Mutations in the 10-A residue region were analysed in order to determine their effects on the production of the core+1 protein in vivo:

for HCV-1 (FIG. 2A), insertion of mutation N18, which contains a triple substitution of two A to G and of an A to C at nucleotides 366, 367 and 373 respectively (codons 9 and 11), gave rise to pHPI-1382, whereas mutation N19, which consists of an A to G and of two A to C changes at nucleotides 367, 369 and 373 respectively (codons 9, 10 and 11), resulted in pHPI-1383;

for HCV-1a (H) (FIG. 2B), mutation N15, which consists of an A to G change at position 366 (codon 9), gave rise to pHPI-1395, and N16, which carries an A to C substitution at nucleotide 369 (codon 10), resulted in pHPI-1396. Both N15 and N16 mutations contain a single substitution, as the HCV-1a (H) isolate already carries a G and a C at positions 367 and 373. None of these mutations has a significant effect on luciferase activity in vivo.

This result suggests that the presence of the 10 consecutive adenines at codons 8-11, which are found only in the HCV-1 isolate, is critical neither for core+1 expression in vivo nor for the expression of core+1 protein in rabbit reticulocyte lysates.

2.3 ATG Initiator Codon of the HCV Core Coding Sequence is not Essential for the Expression of the Core+1 Protein in Transfected Cells

The molecular mechanism implicated in the in vivo expression of the core+1 ORF was further investigated using two mutations introduced into the core coding region of the core+1-LUC-tagged constructs of both the HCV-1 and HCV-1a (H) isolates.

Mutation N3 converted the ATG initiator codon of the core ORF into a terminator codon and mutation N6 introduced a stop codon at the 25th position of the core coding sequence at nucleotide 414 (P25, CCG). The resulting plasmids were named pHPI-1343 and pHPI-1344 for HCV-1, and pHPI-1346 and pHPI-1347 for HCV-1a (H), respectively.

The N3 and N6 mutations failed to block core+1 expression in transfected cells for both HCV-1 and HCV-1a (H) isolates. Furthermore, the N3 and N6 mutations increased the levels of luciferase activity. In contrast, and consistent with previous in vitro studies, the N3 mutation abrogated the synthesis of the 72 kDa core+1-LUC protein from HCV-1 in vitro (FIG. 4A[b], lane 2), whereas N6 had no effect on the production of the core+1-LUC chimeric protein (FIG. 4A[b], lane 3). Furthermore, as expected from previous studies, the core+1-LUC constructs (WT, N3, N6) from HCV-1a (H) failed to produce detectable levels of the chimeric protein (FIG. 4B[b]).

These data show that the predominant translation mechanism for core+1 expression differs between rabbit reticulocyte lysates (in vitro) and transfected cells (in vivo). Furthermore, these data show that the expression of the core+1 ORF in vivo does not require the expression of core protein.

These results suggest that blocking the translation of the core ORF has a positive effect on the translation of the core+1 ORF and that ribosomal frameshifting is not the predominant mechanism directing core+1 expression in vivo.

2.4 Efficient Translation of the Core+1 ORF In Vivo is Mediated by Internal Initiation Codon(s)

It has been shown in the above experiments that the expression of the core+1 ORF in vivo is not suppressed by changes in the initiator ATG or the A-rich region. Therefore, mutagenesis experiments have been carried out in order to test whether downstream codons may function as translation initiation sites for the expression of the core+1 ORF.

To facilitate the description of the mutations affecting the core+1 ORF, the GCA alanine codon at nucleotide 346 is arbitrarily defined as the first codon of the core+1 ORF. Three nonsense mutations were separately inserted into the core+1 coding sequences of HCV-1 and HCV-1a (H):

-   mutation N1 introduced a TAG stop codon into the core+1 ORF at nucleotide 472 (W43, TGG), resulting in pHPI-1342 and pHPI-1345, respectively for HCV-1 and HCV-1a (H);
-   mutation N21 changed the 79th codon of the core+1 ORF at nucleotide 580 (G79, GGT) to a TAG stop codon, resulting in pHPI-1380 (HCV-1) and pHPI-1398 [HCV-1a (H)];
-   mutation N22 introduced a TAG terminator codon eight codons downstream of mutation N21 at nucleotide 604 (M87), resulting in pHPI-1381 (HCV-1) and pHPI-1397 [HCV-1a (H)].
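The codon numbering used above follows directly from the stated ORF starts: codon n begins at nucleotide start + 3(n − 1). The short sketch below checks the positions quoted in this section; the function names are illustrative only.

```python
# Sketch only: convert between codon numbers and nucleotide positions
# using the ORF start positions given in the text.

CORE_ORF_START = 342        # first nucleotide of the core coding sequence
CORE_PLUS1_ORF_START = 346  # GCA codon defined as codon 1 of the core+1 ORF


def codon_start(codon_number, orf_start):
    """First nucleotide of the given codon (codon numbers start at 1)."""
    return orf_start + 3 * (codon_number - 1)


def codon_number(nucleotide, orf_start):
    """Codon number whose first nucleotide is at the given position."""
    offset = nucleotide - orf_start
    if offset < 0 or offset % 3 != 0:
        raise ValueError("position is not the first nucleotide of a codon in this frame")
    return offset // 3 + 1


# The core+1 ORF is shifted by one nucleotide relative to the core ORF.
frame_shift = (CORE_PLUS1_ORF_START - CORE_ORF_START) % 3

# Positions quoted in the text for mutations N1, N21 and N22, and for M85.
n1_position = codon_start(43, CORE_PLUS1_ORF_START)
n21_position = codon_start(79, CORE_PLUS1_ORF_START)
n22_position = codon_start(87, CORE_PLUS1_ORF_START)
m85_position = codon_start(85, CORE_PLUS1_ORF_START)

# Mutation N6 lies in the core (0) frame at codon 25.
n6_position = codon_start(25, CORE_ORF_START)
```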

Mutations N1 and N21 had no significant effect on the in vivo expression of the core+1-LUC gene for both HCV-1 and HCV-1a (H) isolates. In contrast, the N22 mutation almost completely abolished the synthesis of the core+1-LUC protein from both HCV-1 and HCV-1a (H) isolates in both BHK-21 and Huh-7 cell lines.

As expected from previous studies, mutations N1, N21 and N22 failed to support the in vitro expression of the core+1 ORF.

These data show that efficient translation initiation of the core+1 ORF in transfected cells is mediated by downstream (internal) initiation codons, which may be located approximately between nucleotides 580 and 604, that is, downstream of the N21 stop codon, which has no effect, and no further than the N22 stop codon, which abolishes expression.

The region between nucleotides 583 and 606 (codons 80-87) contains two ATG codons (nucleotides 598-ATGNNNATG-606). To assess the functional importance of these ATG codons as initiation sites for the translation of the core+1 protein in vivo, the following three mutations were tested:

-   mutation N25 converted both methionines at positions 85 and 87 to glycines, resulting in pHPI-1401;
-   mutation N23 altered only M85, resulting in pHPI-1399;
-   mutation N24 altered only M87, resulting in pHPI-1400.

The transfection of BHK-21 and Huh-7 cells with mutants pHPI-1399 (N23) and pHPI-1400 (N24) yielded levels of luciferase translation similar to those of the wild-type construct. In contrast, mutation N25 severely affected the production of the chimeric core+1-LUC protein, which was about 23% of the wild-type level in BHK-21 cells and about 26% in Huh-7 cells.

These results suggest that the two methionines (M85 and M87) of the core+1 coding region are involved in core+1 expression, since conversion of both of them to glycine significantly reduced the levels of luciferase activity.
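Candidate internal initiation codons of this kind can be located by scanning a sequence for ATG triplets that lie in the core+1 reading frame. The sketch below does this for a short invented fragment; the sequence and its starting coordinate are hypothetical and chosen only so that the in-frame ATG codons fall at positions 598 and 604. It is not the HCV sequence of FIG. 3B.

```python
# Sketch only: find ATG codons lying in a given reading frame.
# TOY_FRAGMENT is an invented sequence, not taken from any HCV isolate.

def in_frame_atg_positions(seq, first_nt, frame_start):
    """Coordinates of ATG codons in the frame that begins at frame_start.

    seq[0] is assumed to correspond to nucleotide number first_nt.
    """
    offset = (frame_start - first_nt) % 3  # index of first in-frame codon
    positions = []
    for i in range(offset, len(seq) - 2, 3):
        if seq[i:i + 3] == "ATG":
            positions.append(first_nt + i)
    return positions


TOY_FRAGMENT = "CCAGGAATGGCTATGAGGCATGCC"  # hypothetical 24-nt fragment
TOY_FIRST_NT = 592                         # hypothetical coordinate of its first base
CORE_PLUS1_ORF_START = 346                 # from the text

atg_positions = in_frame_atg_positions(TOY_FRAGMENT, TOY_FIRST_NT, CORE_PLUS1_ORF_START)
```

The toy fragment also contains an out-of-frame ATG near its end, which the scan skips because it does not start on a codon boundary of the chosen frame.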

2.5 Comparison of Size of the Core+1 Protein Produced In Vivo and In Vitro

The IRES-core+1-LUC cassette contained in the dicistronic construct pHPI-1333 (HCV-1) and the corresponding negative control IRES-core-1-LUC cassette of pHPI-1332 were transferred into a monocistronic expression vector under the control of a CMV promoter, resulting in pHPI-1362 and pHPI-1363, respectively (FIG. 7A). This system improves the detection of the luciferase protein, as the HCV IRES is more active in monocistronic constructs. Specifically, the luciferase activity exhibited by the monocistronic IRES-core+1-LUC construct pHPI-1362 in BHK-21 cells forty-eight hours post-transfection was about nine-fold higher than that yielded by the respective dicistronic pHPI-1333. Immunoprecipitation experiments were carried out with extracts of BHK-21 cells transfected with pHPI-1362, using a goat polyclonal antibody raised against luciferase. A protein with an apparent molecular mass of around 62 kDa reacted strongly with the polyclonal antibody. This protein was clearly smaller than the chimeric protein core+1-LUC produced in vitro from the pHPI-1333 construct.

These results are consistent with the above mutagenesis data and show that the core+1 protein produced in mammalian cells is smaller by about 10 kDa than the core+1 protein produced in vitro.
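A rough estimate shows why a size difference of about 10 kDa is what internal initiation at codon 85 or 87 would predict. The sketch assumes that the in vitro product starts at the core initiator ATG (nucleotide 342), that the in vivo product starts at nucleotide 598 or 604, and that an amino acid residue contributes about 110 Da on average; the last figure is a common rule of thumb, not a value from this specification.

```python
# Sketch only: estimate the mass of the N-terminal region that is absent
# when translation starts at an internal ATG instead of the core ATG.

AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average, an assumption


def skipped_mass_kda(upstream_start_nt, downstream_start_nt):
    """Approximate mass (kDa) encoded between two initiation positions."""
    residues = (downstream_start_nt - upstream_start_nt) / 3
    return residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


CORE_ATG = 342  # assumed start of the in vitro product
difference_from_atg598 = skipped_mass_kda(CORE_ATG, 598)
difference_from_atg604 = skipped_mass_kda(CORE_ATG, 604)
```

Both estimates fall between 9 and 10 kDa, in line with the difference between the 72 kDa in vitro product and the protein of around 62 kDa detected in transfected cells.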

REFERENCES

-   1. Choo Q L, Kuo G, Wiener A J, et al. (1989). Isolation of a cDNA clone derived from a blood-borne non-A, non-B viral hepatitis genome. Science, 244, 359-362.
-   2. Di Bisceglie A M. (2000). Hepatology, 31, 1014-1018.
-   3. Houghton M. (1996). Hepatitis C virus, Fields, ed.
-   4. Reed K E, and Rice C M. (2000). Curr. Top. Microbiol. Immunol. 242, 55-84.
-   5. Xu Z, Choi J, Yen T S, Lu W, Strohecker A, Govindarajan S, Chien D, Selby M, Ou J. (2001). Synthesis of a novel hepatitis C virus protein by ribosomal frameshift. EMBO J, 20, 3840-3848.
-   6. Varaklioti A, Vassilaki N, Georgopoulou U, Mavromara P. (2002). J. Biol. Chem. 277, 17713-17721.
-   7. U.S. Ser. No. 09/644,987 filed on Aug. 24, 2000 “Nucleic acids and new polypeptides associated with and/or overlapping with Hepatitis C virus core gene products”.
-   8. Devereux et al. (1984). Nucl. Acids Res. 12:387.
-   9. Psaridi L, Georgopoulou U, Varaklioti A, Mavromara P. (1999). FEBS Lett. 453, 49-53.
-   10. Inchauspe G, Zebedee D H, Lee M, Sugitani M, Nasoff and A M Prince. (1991). Genomic structure of the human prototype strain H of Hepatitis C virus: comparison with American and Japanese isolates. Proc. Natl. Acad. Sci. USA 88:10292-10296.

1. A shorter form core+1 protein of Hepatitis C virus (HCV) which is theproduct of translation of a coding sequence consisting of all or part ofa nucleotide sequence extending from nucleotide 598 to nucleotide 920within the core+1 ORF of HCV represented on FIG. 3B.
 2. The shorter formcore+1 protein according to claim 1, which is encoded by a nucleotidesequence having a translation initiation codon (ATG) at position 598 orby a nucleotide sequence having an ATG at position 606 of the HCV core+1coding sequence.
 3. The shorter form core+1 protein according to claim 1or 2, which is encoded by: (i) a nucleotide sequence extending fromnucleotide 598 to nucleotide 826 of the sequence represented on FIG. 3B;or (ii) a nucleotide sequence extending from nucleotide 598 tonucleotide 897 of the sequence represented on FIG. 3B; or (iii) anucleotide sequence extending from nucleotide 606 to nucleotide 826 ofthe sequence represented on FIG. 3B; or (iv) a nucleotide sequenceextending from nucleotide 606 to nucleotide 897 of the sequencerepresented on FIG. 3B; or (v) a nucleotide sequence extending fromnucleotide 606 to nucleotide 920 of the sequence represented on FIG. 3B.4. A shorter form core+1 protein of HCV obtainable in vivo by expressionin transfected cells of the core+1 open reading frame (ORF) which iscontained in nucleotide sequence extending from nucleotide at position342 to nucleotide at position 920 of the nucleotide sequence representedon FIG. 3B, and which molecular weight is less than 10 kDa.
 5. Theshorter form core+1 protein according to claim 4, which is theexpression product of the core+1 ORF in mammalian cells.
 6. The shorterform core+1 protein according to anyone of claims 1 to 5, which isrecognized by a serum of patients infected with HCV.
 7. The shorter formcore+1 protein according to anyone of claims 1 to 6 which comprisesamino-acid sequence extending from amino-acid residue corresponding tonucleotide 598 to amino-acid residue corresponding to nucleotide 826, orto nucleotide 897, or to nucleotide
 920. 8. The shorter form core+1protein according to anyone of claims 1 to 6 which comprises amino-acidsequence extending from amino-acid residue corresponding to nucleotide606 to amino-acid residue corresponding to nucleotide 826, or tonucleotide 897, or to nucleotide
 920. 9. A peptide contained within theshorter form core+1 protein according to anyone of claims 1 to 8 whichcomprises an epitope.
 10. The peptide according to claim 9, whichcomprises the following amino-acid sequence:COOH-T-Y—R-S-S-A-P-L-L-E-A-L-P-G-P—NH₂ or a variant thereof which reactswith antibodies directed against said peptide sequence.
11. The peptide variant according to claim 10, which is derived from the sequence of FIG. 8.
12. A nucleotide sequence consisting in a fragment of the nucleotide sequence extending from nucleotide 342 to nucleotide 920 represented on FIG. 3B, which fragment is capable of encoding a shorter form core+1 protein of HCV when transfected in mammalian cells under expression conditions.
13. A nucleotide sequence encoding a shorter form core+1 protein according to claim 7 or 8.
14. The nucleotide sequence according to claim 12 or 13, comprising a nucleotide sequence extending from nucleotide 598 or from nucleotide 606 to nucleotide 826 within the core+1 coding sequence represented on FIG. 3B.
15. The nucleotide sequence, which is chosen among: (i) a nucleotide sequence extending from nucleotide 606 to nucleotide 826 of the sequence represented on FIG. 3B; (ii) a nucleotide sequence extending from nucleotide 606 to nucleotide 897 of the sequence represented on FIG. 3B; (iii) a nucleotide sequence extending from nucleotide 606 to nucleotide 920 of the sequence represented on FIG. 3B; (iv) a nucleotide sequence extending from nucleotide 598 to nucleotide 826 of the sequence represented on FIG. 3B; (v) a nucleotide sequence extending from nucleotide 598 to nucleotide 897 of the sequence represented on FIG. 3B; (vi) a nucleotide sequence extending from nucleotide 598 to nucleotide 920 of the sequence represented on FIG. 3B; (vii) a fragment of sequence (i), (ii), (iii), (iv), (v), or (vi) which is capable of encoding a shorter form core+1 protein according to any one of claims 1 to 8 in mammalian cells or an epitope thereof.
16. A nucleotide sequence comprising a HCV core protein coding sequence, which is derived from the nucleotide sequence represented on FIG. 3B as a result of one or several mutations selected among the following: in the 9th and 11th codons, a mutation which respectively corresponds to a triple substitution of two A to G and of an A to C; or in the 9th, 10th and 11th codons, a mutation which respectively consists of a substitution of one A to G and two A to C; or in the 9th codon, a mutation which consists of a substitution of A to G; or in the 10th codon, a mutation which consists of a substitution of A to C or a substitution of an initiator codon into a terminator codon; or a substitution of the 25th codon into a stop codon; or a substitution of the 43rd codon into a stop codon; or a substitution of the 79th codon into a stop codon; or a substitution of the 87th codon into a stop codon; or a substitution of the 85th codon into a stop codon and/or a substitution of the 87th codon into a stop codon.
17. The nucleotide sequence according to any of claims 12 to 15, said sequence being a functional variant thereof having at least 70% identity.
18. A nucleotide sequence hybridizing under stringent conditions to a nucleotide sequence according to any one of claims 12 to 17.
19. A nucleotide sequence which is a sequence complementary to a nucleotide sequence according to any one of claims 12 to 18.
20. A nucleotide sequence hybridizing under stringent conditions with at least a complementary sequence of a nucleotide sequence according to any one of claims 12 to 17.
21. A chimeric gene comprising a promoter operatively linked to a nucleotide sequence according to any of claims 12 to 15 and 17 to 20.
22. The chimeric gene according to claim 21, wherein said promoter is selected from the group consisting of lactose promoter system, tryptophan promoter system, tac promoter and CMV promoter.
23. The chimeric gene according to claim 21, comprising a CMV/T7 promoter and a chloramphenicol acetyl transferase (CAT) gene in a first cistron and the entire IRES of the HCV core coding sequence and part of the wild type core coding sequence fused to LUC gene in a second cistron.
24. The chimeric gene according to claim 23, wherein the LUC gene is fused to the core sequence in the 0, +1 or −1 frame.
25. A vector comprising a chimeric gene according to any one of claims 21 to 24.
26. The vector of claim 24, which is a plasmid, a cosmid, a phage or a virus.
27. The vector according to claim 24, which is preferably a plasmid selected from the group consisting of pHPI-1333 and pHPI-1335 represented on FIG. 1.
28. Recombinant cells transfected with a nucleotide sequence according to any one of claims 1 to 27.
29. The recombinant cells according to claim 28, which are animal, mammalian or human cells.
30. The recombinant cells according to claim 29, which are BHK-21 or Huh-1 cells.
31. Purified antibodies which specifically bind to the shorter form core+1 protein according to any one of claims 1 to 8, without cross-reacting with core protein and/or core+1 protein.
32. Purified antibodies which specifically bind to polypeptide fragments common to the shorter form core+1 protein according to any one of claims 1 to 8 and core+1 protein, and optionally core protein.
33. Purified antibodies which specifically bind to the peptide according to any one of claims 9 to 11.
34. Purified antibodies according to any one of claims 31 to 33, which are monoclonal antibodies.
35. A method for producing antibodies, wherein the shorter form core+1 protein according to any one of claims 1 to 8, or a fragment thereof, is used as antigen.
36. A purified polypeptide which specifically binds to at least one antibody according to any one of claims 31 to 34 or to an antibody produced by the method according to claim 35.
37. An in vitro method for the detection of infection by Hepatitis C virus in a biological sample, said method comprising determining the presence or absence of the shorter form core+1 protein according to any one of claims 1 to 8.
38. The method according to claim 37, wherein said shorter form core+1 protein is detected with antibodies which are immunologically reactive with the shorter form core+1 protein according to any one of claims 1 to 8.
39. The method according to claim 37, wherein said shorter form core+1 protein is detected with antibodies which are immunologically reactive with the peptide according to any one of claims 9 to 11.
40. A method for the in vitro detection of infection by Hepatitis C virus, which comprises detecting antibodies recognizing the shorter form core+1 protein according to any one of claims 1 to 8, in a biological sample.
41. The method according to any one of claims 37 to 40, wherein the formation of antigen-antibody complex is detected by immunoassay (direct detection) or ELISA (indirect detection).
42. A method of screening compounds for their capacity to interact with viral propagation in cells infected by HCV, said method comprising: a. contacting said cells with a candidate compound; b. determining interaction between said candidate compound and expression of said shorter form core+1 protein.
43. The method of screening compounds according to claim 42, wherein interaction is determined by measuring the expression level of shorter form core+1 protein before and after contacting the HCV infected cells with a candidate compound.
44. The method according to claim 42 or 43, wherein said cells infected by HCV are animal, mammalian or human cells.
45. The method according to claim 44, wherein said cells infected by HCV are BHK-21 or Huh-1 cells.
46. Use of a compound selected according to the method of any one of claims 38 to 41 for the preparation of a medicine for the treatment of disorders induced by or associated with infection of HCV.